A Unicode-based Environment for Creation and Use of Language Resources

Authors

  • Valentin Tablan
  • Cristian Ursu
  • Kalina Bontcheva
  • Hamish Cunningham
  • Diana Maynard
  • Oana Hamza
  • Tony McEnery
  • Paul Baker
  • Mark Leisher
Abstract

GATE is a Unicode-aware architecture, development environment, and framework for building systems that process human language. It is often thought that the character set problem has been solved by the arrival of the Unicode standard. The standard is an important advance, but in practice the ability to process text in many of the world’s languages is still limited. This paper describes work done in the context of the GATE project that makes use of Unicode and plugs some of the gaps for language processing R&D. First, we look at the storage and decoding of Unicode-compliant linguistic resources. Next, we detail the new capabilities for processing textual data and taking advantage of the Unicode standard. Finally, we describe the solutions used to add Unicode display and editing capabilities to the graphical interface.

Similar resources

Internationalization of a Distance Exam Web Environment

This paper describes an architecture to provide multilingual support for an exam Web environment with an emphasis on Arabic localization (or Arabization). Developing software products for populations with different cultures is a two-step process: internationalization followed by localization. Proper Arabization is particularly complex with many interesting challenges. In this context, we develo...

Conservation and sustainable exploitation of plant genetic resources: International developments

Plant Genetic Resources (PGRs) are one of the most valuable natural resources of any country. Biotechnology through genetic engineering of plants and the creation of new plant varieties can increase the value of these resources. Different technical and legal mechanisms such as ex situ/in situ collection of PGRs, and Intellectual Property Rights (IPRs...

Towards an Inquiry-Based Language Learning: Can a Wiki Help?

Wiki use may help EFL instructors to create an effective learning environment for inquiry-based language teaching and learning. The purpose of this study was to investigate the effects of wikis on the EFL learners’ IBL process. Forty-nine EFL students participated in the study while they conducted research projects in English. The Non-wiki group (n = 25) received traditional inquiry instr...

GATE: an Architecture for Development of Robust HLT Applications

In this paper we present GATE, a framework and graphical development environment which enables users to develop and deploy language engineering components and resources in a robust fashion. The GATE architecture has enabled us not only to develop a number of successful applications for various language processing tasks (such as Information Extraction), but also to build and annotate corpora and...

A GIS-based integrative approach for land use optimization in a semi-arid watershed

The proper use of natural resources can preserve these valuable assets. In line with the management of natural resources, land use optimization can be highly useful. The aim of the present study is to propose an appropriate integrative model for optimized allocation of lands for surface runoff and sediment load minimization and net income maximization in Bayg watershed, Iran. In this study, fiv...

Publication date: 2002